(54) DEVICE AND METHOD FOR INFORMATION PROCESSING AND RECORDING MEDIUM

(57) Abstract:

PROBLEM TO BE SOLVED: To provide a robot that performs operations rich in variety.

SOLUTION: A user's voice picked up by a microphone 15 is recognized by a voice recognition part 31A.
A gesture of the user photographed by a CCD 16 is recognized by an image recognition part 31B. An
action-determining mechanism part 33 determines the operation of the robot by using the voice
information outputted from the voice recognition part 31A and the image information outputted from
the image recognition part 31B.
DETAILED DESCRIPTION



[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to an information processing device and method and to a
recording medium, and in particular to an information processing device and method, and a recording
medium, suitable for use in a robot that decides its operation using voice information and image
information.
[0002] 

[Description of the Prior Art] Conventionally, many robots (including stuffed-toy types) have been
commercialized as toys and the like: robots that output synthesized speech when a touch switch is
pressed, and robots that recognize a user's utterance and return a response sentence so that the user
can enjoy a simulated conversation.

[0003] Robots that capture an image, perform image recognition on the captured image, judge the
situation, and act autonomously have also been commercialized.
[0004]

[Problem(s) to be Solved by the Invention] However, speech recognition has the problem that
misrecognition occurs when the user's utterance is not made clearly. There is also the problem that,
for an utterance containing an ambiguous word, for example a demonstrative pronoun, the object that
the demonstrative pronoun refers to may not be recognizable.

[0005] Moreover, the robots mentioned above act autonomously depending on only one kind of
information, voice or image, and it has been difficult for them to act autonomously using both voice
and image information.

[0006] This invention has been made in view of such a situation. It aims to perform more reliable
speech recognition by using both voice and image information, and to provide the user with operations
rich in variety.
[0007] 

[Means for Solving the Problem] An information processing device according to claim 1 is characterized
by including speech recognition means for recognizing voice, image recognition means for recognizing
an image, and decision means for deciding the operation of a robot using at least one of the
recognition result of the speech recognition means and the recognition result of the image recognition
means.

[0008] The device can further include holding means for holding a table that describes the relation
between the recognition result of the speech recognition means, the recognition result of the image
recognition means, and the robot operation uniquely determined by those recognition results.
[0009] When the recognition result of the speech recognition means is not uniquely determined, the
decision means can uniquely determine it using the recognition result of the image recognition means,
and can decide the operation of the robot using the determined recognition result.

[0010] When two or more objects exist in the image that the image recognition means recognizes, the
decision means can decide the operation of the robot using a recognition result that is uniquely
determined by means of the recognition result of the speech recognition means.

[0011] The image recognition means can detect the direction in which a preset one of the user's
finger, face, eyes, and jaw points, and recognize the image located in that direction.
[0012] The device can further include storage means for storing data on gestures that the user
performs. By recognizing an image of the user, the image recognition means can detect a gesture stored
in the storage means and take the detected result as the recognition result.

[0013] The device can further include detection means for detecting the user's face and measurement
means for measuring the distance between the user and the robot itself from the size of the user's
face detected by the detection means, and the decision means can decide the operation of the robot
also using the measurement result of the measurement means.

[0014] The speech recognition means can detect the rhythm contained in environmental sound and take
it as a recognition result.

[0015] The speech recognition means can detect an acoustic phenomenon from environmental sound and
take it as a recognition result.

[0016] The information processing method according to claim 10 is characterized by including a speech
recognition step of recognizing voice, an image recognition step of recognizing an image, and a
decision step of deciding the operation of a robot using at least one of the recognition result of the
speech recognition step and the recognition result of the image recognition step.
[0017] The program of the recording medium according to claim 11 is characterized by including a
speech recognition step of recognizing voice, an image recognition step of recognizing an image, and a
decision step of deciding the operation of a robot using at least one of the recognition result of the
speech recognition step and the recognition result of the image recognition step.
[0018] In the information processing device according to claim 1, the information processing method
according to claim 10, and the recording medium according to claim 11, voice is recognized, an image
is recognized, and the operation of a robot is decided using at least one of the voice recognition
result and the image recognition result.
[0019] 

[Embodiment of the Invention] Drawing 1 shows an example of the external configuration of one
embodiment of a robot to which this invention is applied, and drawing 2 shows an example of its
electrical configuration.

[0020] In this embodiment the robot has the shape of a dog. It is constructed by connecting the leg
units 3A, 3B, 3C, and 3D to the front, rear, right, and left of the body unit 2, and by connecting the
head unit 4 and the tail unit 5 to the front end and the rear end of the body unit 2, respectively.

[0021] The tail unit 5 is drawn out from a base section 5B provided on the top face of the body unit 2
so that it can bend or swing freely with two degrees of freedom. The body unit 2 houses the controller
10 which controls the whole robot, the battery 11 which serves as the robot's power source, and the
internal sensor section 14 consisting of the battery sensor 12 and the heat sensor 13.

[0022] In the head unit 4, a microphone 15 corresponding to the "ears", a CCD (Charge Coupled Device)
camera 16 corresponding to the "eyes", a touch sensor 17 corresponding to the sense of touch, a
loudspeaker 18 corresponding to the "mouth", and so on are arranged at predetermined positions.
[0023] As shown in drawing 2, actuators 3AA1 to 3AAK, 3BA1 to 3BAK, 3CA1 to 3CAK, 3DA1 to 3DAK, 4A1 to
4AL, 5A1, and 5A2 are arranged, respectively, at the joints of the leg units 3A to 3D, at the
connections between the leg units 3A to 3D and the body unit 2, at the connection between the head
unit 4 and the body unit 2, and at the connection between the tail unit 5 and the body unit 2. Each
joint or connection can thereby rotate with a predetermined degree of freedom.

[0024] The microphone 15 in the head unit 4 collects surrounding voice (sound), including utterances
from the user, and sends the resulting sound signal to the controller 10. The CCD camera 16 captures
the surrounding scene and sends the resulting image signal to the controller 10.

[0025] The touch sensor 17 is provided on the upper part of the head unit 4. It detects pressure
received through physical actions from the user such as "stroking" and "striking", and sends the
detection result to the controller 10 as a pressure detection signal.

[0026] The battery sensor 12 in the body unit 2 detects the remaining charge of the battery 11 and
sends the detection result to the controller 10 as a battery remaining-charge detection signal. The
heat sensor 13 detects the heat inside the robot and sends the detection result to the controller 10
as a heat detection signal.
[0027] The controller 10 contains a CPU (Central Processing Unit) 10A, a memory 10B, and so on, and
performs various kinds of processing by having the CPU 10A execute the control program stored in the
memory 10B. That is, based on the sound signal, image signal, pressure detection signal, battery
remaining-charge detection signal, and heat detection signal given from the microphone 15, the CCD
camera 16, the touch sensor 17, the battery sensor 12, and the heat sensor 13, the controller 10 judges
the surrounding situation and whether there is a command or other action from the user.

[0028] Furthermore, the controller 10 decides the subsequent action based on this judgment and, based
on that decision, drives the necessary ones of the actuators 3AA1 to 3AAK, 3BA1 to 3BAK, 3CA1 to 3CAK,
3DA1 to 3DAK, 4A1 to 4AL, 5A1, and 5A2. The robot is thereby made to swing the head unit 4 up and down
and left and right, to move the tail unit 5, or to drive the leg units 3A to 3D so that it walks, and
so on.

[0029] As needed, the controller 10 also generates synthesized speech and supplies it to the
loudspeaker 18 for output, or turns on, turns off, or blinks LEDs (Light Emitting Diodes, not
illustrated) provided at the positions of the robot's "eyes".

[0030] As described above, the robot takes action autonomously based on the surrounding situation and
the like.
[0031] Next, drawing 3 shows an example of the functional configuration of the controller 10 of
drawing 2. The functional configuration shown in drawing 3 is realized by the CPU 10A executing the
control program stored in the memory 10B.

[0032] The controller 10 consists of: the sensor input processing section 31, which recognizes
specific external conditions; the emotion/instinct model section 32, which accumulates the recognition
results of the sensor input processing section 31 and the like and expresses the states of emotion and
instinct; the action decision mechanism section 33, which decides the subsequent action based on the
recognition results of the sensor input processing section 31 and the like; the posture transition
mechanism section 34, which makes the robot actually take action based on the decision of the action
decision mechanism section 33; the control mechanism section 35, which drives and controls the
actuators 3AA1 to 5A1 and 5A2; the speech synthesis section 36, which generates synthesized speech; and
the sound processing section 37, which controls the output of the speech synthesis section 36.

[0033] Based on the sound signal, image signal, pressure detection signal, and so on given from the
microphone 15, the CCD camera 16, the touch sensor 17, and the like, the sensor input processing
section 31 recognizes specific external conditions, specific actions by the user, instructions from
the user, and so on. It notifies the emotion/instinct model section 32 and the action decision
mechanism section 33 of state recognition information expressing the recognition result.

[0034] That is, the sensor input processing section 31 has a speech recognition section 31A, which
performs speech recognition using the sound signal given from the microphone 15 under the control of
the action decision mechanism section 33. The speech recognition section 31A notifies the
emotion/instinct model section 32 and the action decision mechanism section 33 of commands such as
"walk", "lie down", and "chase the ball", and of other speech recognition results, as state
recognition information.

[0035] The sensor input processing section 31 also has an image recognition section 31B, which
performs image recognition processing using the image signal given from the CCD camera 16. When, as a
result of that processing, the image recognition section 31B detects, for example, "a round red
object" or "a plane perpendicular to the ground and at least a predetermined height", it notifies the
emotion/instinct model section 32 and the action decision mechanism section 33 of an image recognition
result such as "there is a ball" or "there is a wall" as state recognition information. It also
recognizes gestures made by the user and notifies the action decision mechanism section 33 of the
recognition result.

[0036] Furthermore, the sensor input processing section 31 has a pressure processing section 31C,
which processes the pressure detection signal given from the touch sensor 17. When, as a result of that
processing, a pressure at or above a predetermined threshold and of short duration is detected, the
pressure processing section 31C recognizes that the robot "was struck (scolded)". When a pressure below
the predetermined threshold and of long duration is detected, it recognizes that the robot "was stroked
(praised)". It notifies the emotion/instinct model section 32 and the action decision mechanism
section 33 of the recognition result as state recognition information.
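
The two-way decision made by the pressure processing section can be sketched as follows. This is only an illustrative sketch: the function name and both numeric boundaries are hypothetical, since the text speaks only of "a predetermined threshold" and of short and long durations without giving values.

```python
from typing import Optional

# Hypothetical values: the source only speaks of "a predetermined threshold".
PRESSURE_THRESHOLD = 0.5  # normalized pressure (assumed scale 0..1)
SHORT_CONTACT_S = 0.3     # assumed boundary between short and long contact


def classify_touch(pressure: float, duration_s: float) -> Optional[str]:
    """Return "struck", "stroked", or None for an unclassified contact."""
    if pressure >= PRESSURE_THRESHOLD and duration_s < SHORT_CONTACT_S:
        return "struck"   # strong, brief contact: treated as a scolding
    if pressure < PRESSURE_THRESHOLD and duration_s >= SHORT_CONTACT_S:
        return "stroked"  # weak, sustained contact: treated as praise
    return None           # combinations the source does not describe
```

Returning None for the two remaining combinations is a design choice of this sketch; the text does not say how such contacts are handled.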

[0037] The emotion/instinct model section 32 manages an emotion model and an instinct model that
express the states of the robot's emotion and instinct, respectively. The action decision mechanism
section 33 decides the next action based on the state recognition information from the sensor input
processing section 31, the emotion/instinct state information from the emotion/instinct model section
32, the passage of time, and so on, and sends the content of the decided action to the posture
transition mechanism section 34 as action command information.

[0038] Based on the action command information supplied from the action decision mechanism section
33, the posture transition mechanism section 34 generates posture transition information for changing
the robot's posture from the current posture to the next posture, and outputs it to the control
mechanism section 35. The control mechanism section 35 generates control signals for driving the
actuators 3AA1 to 5A1 and 5A2 according to the posture transition information from the posture
transition mechanism section 34, and sends them to the actuators 3AA1 to 5A1 and 5A2. The actuators
3AA1 to 5A1 and 5A2 are thereby driven according to the control signals, and the robot takes action
autonomously.

[0039] The robot 1 recognizes the user's voice and gesture and decides its action. Drawing 4 shows the
portion, extracted from the functional configuration example of drawing 3, that recognizes the user's
voice and gesture and decides the action. That is, the robot has the microphone 15 and the speech
recognition section 31A for recognizing the user's voice, and the CCD 16 and the image recognition
section 31B for recognizing the user's gesture, and the action decision mechanism section 33 decides
the action of the robot 1 from the recognition results obtained from the speech recognition section
31A and the image recognition section 31B.
[0040] Drawing 5 shows the detailed configuration of the speech recognition section 31A. The user's
utterance is inputted to the microphone 15, which converts it into a sound signal in the form of an
electrical signal. This sound signal is supplied to the AD (Analog-Digital) conversion section 51 of
the speech recognition section 31A. In the AD conversion section 51, the sound signal, which is an
analog signal from the microphone 15, is sampled and quantized, and is thereby converted into voice
data, which is a digital signal. This voice data is supplied to the feature extraction section 52.
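
Sampling and quantization as performed in the AD conversion stage can be illustrated with a short sketch. The sample rate, bit depth, and function name below are assumptions chosen for illustration; the text does not specify any of them.

```python
import math
from typing import Callable, List


def sample_and_quantize(signal: Callable[[float], float], num_samples: int,
                        sample_rate_hz: int = 16000, bits: int = 16) -> List[int]:
    """Sample signal(t) at a fixed rate and quantize to signed integers."""
    max_level = 2 ** (bits - 1) - 1           # 32767 for 16 bits
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz                # sampling instant in seconds
        x = max(-1.0, min(1.0, signal(t)))    # clip to the converter's range
        samples.append(round(x * max_level))  # quantize to the nearest level
    return samples


# Usage: 160 samples (10 ms at 16 kHz) of a 440 Hz tone.
tone = sample_and_quantize(lambda t: math.sin(2 * math.pi * 440 * t), 160)
```

The resulting integer sequence plays the role of the voice data handed to the feature extraction stage.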

[0041] For each suitable frame of the voice data from the AD conversion section 51, the feature
extraction section 52 extracts feature parameters such as a spectrum, linear prediction coefficients,
cepstrum coefficients, and line spectrum pairs, and supplies them to the feature buffer 53 and the
matching section 54. The feature buffer 53 temporarily stores the feature parameters from the feature
extraction section 52.

[0042] Based on the feature parameters from the feature extraction section 52 or the feature
parameters stored in the feature buffer 53, the matching section 54 recognizes the voice (input voice)
inputted to the microphone 15, referring as needed to the acoustic model database 55, the dictionary
database 56, and the grammar database 57.

[0043] That is, the acoustic model database 55 stores acoustic models representing the acoustic
features of the individual phonemes, syllables, and so on in the language of the speech to be
recognized. Here, an HMM (Hidden Markov Model), for example, can be used as the acoustic model. The
dictionary database 56 stores a word dictionary describing pronunciation information for each word to
be recognized. The grammar database 57 stores grammar rules describing how the words registered in the
word dictionary of the dictionary database 56 are chained (connected). Here, rules based on a
context-free grammar (CFG) or on statistical word chain probabilities (N-gram), for example, can be
used as the grammar rules.

[0044] The matching section 54 constructs the acoustic model of a word (word model) by connecting the
acoustic models stored in the acoustic model database 55 with reference to the word dictionary of the
dictionary database 56. Furthermore, the matching section 54 connects several word models with
reference to the grammar rules stored in the grammar database 57 and, using the word models connected
in this way, recognizes the voice inputted to the microphone 15 based on the feature parameters by,
for example, the HMM method. The speech recognition result of the matching section 54 is outputted as
text, for example.

[0045] When the inputted voice needs to be processed again, the matching section 54 performs the
processing using the feature parameters stored in the feature buffer 53, so the user need not be asked
to repeat the utterance.

[0046] Drawing 6 shows the internal configuration of the image recognition section 31B. The image
captured by the CCD 16 is inputted to the AD conversion section 61 of the image recognition section
31B, converted into digital image data, and outputted to the feature extraction section 62. The
feature extraction section 62 performs feature extraction, such as edge detection of objects and
detection of density changes in the image, on the inputted image data, and calculates features such as
feature parameters or feature vectors.

[0047] The features extracted by the feature extraction section 62 are outputted to the face detection
section 63. The face detection section 63 detects the user's face from the inputted features and
outputs the detection result to the distance measurement section 64. Using the detection result
outputted from the face detection section 63, the distance measurement section 64 measures the
distance to the user and also the orientation of the face. The measurement result is outputted to the
action decision mechanism section 33.

[0048] The distance to the user can be measured from the change in the size of the face. This can be
carried out by using, for example, the approach described in "Neural Network-Based Face Detection",
Henry A. Rowley, Shumeet Baluja, and Takeo Kanade, IEEE Transactions on Pattern Analysis and Machine
Intelligence.
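
One simple way to turn a detected face size into a distance is a pinhole-camera relation, in which apparent width is inversely proportional to distance. The sketch below uses that relation as an assumption of this example; it is not taken from the cited reference, and the focal length, the real face width, and the function name are all hypothetical.

```python
def estimate_distance_m(face_width_px: float,
                        focal_length_px: float = 500.0,
                        real_face_width_m: float = 0.16) -> float:
    """Estimate the camera-to-face distance from the face width in the image.

    Assumes a pinhole camera: distance = focal_length * real_width / image_width.
    The default focal length and face width are illustrative values only.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return focal_length_px * real_face_width_m / face_width_px
```

Under this model, a face that appears half as wide is estimated to be twice as far away, which matches the idea of measuring distance from the change in face size.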

[0049] In this embodiment, the size of the face is measured using a single image input, but the
distance to the user may instead be measured by matching between two image inputs (a stereo image).
Three-dimensional information can be extracted from a stereo image by using, for example, the approach
described in "Section 3.3.1 Point Pattern Matching, Image Analysis Handbook, supervised by Mikio
Takagi and Akihisa Shimoda, University of Tokyo Press".
[0050] The features extracted by the feature extraction section 62 are outputted to the face detection
section 63 and also to the matching section 65. The matching section 65 compares the inputted features
with the pattern information stored in the standard-pattern database 66 and outputs the resulting
recognition result to the action decision mechanism section 33. The data stored in the
standard-pattern database 66 are gesture image data and data describing the features of motion
patterns. For gesture recognition, it is possible to use, for example, the approach described in
"Recognition of Sensibility Expression by Gesture, Robotics Society of Japan, Vol. 17, No. 7,
pp. 933-936, 1999".

[0051] Thus, the recognition result outputted from the speech recognition section 31A and the
recognition result (measurement result) outputted from the image recognition section 31B are inputted
to the action decision mechanism section 33. Drawing 7 shows the internal configuration of the action
decision mechanism section 33. The voice recognition result outputted from the speech recognition
section 31A is inputted to the text analysis section 71 of the action decision mechanism section 33.
The text analysis section 71 performs morphological analysis, syntactic analysis, and the like on the
inputted speech recognition result based on the data stored in the dictionary database 72 and the
analysis grammar database 73, and thereby extracts linguistic information such as word information
and syntax information. It also extracts the meaning, intention, and so on of the inputted utterance
based on the contents described in the dictionary.

[0052] That is, the dictionary database 72 stores the information needed to apply the word notation
and the analysis grammar, such as part-of-speech information, as well as semantic information for
individual words. The analysis grammar database 73 stores data describing constraints on word chains
based on the information on each word stored in the dictionary database 72. The text analysis section
71 analyzes the inputted text data of the speech recognition result using these data.

[0053] The data stored in the analysis grammar database 73 are the data needed for text analysis: a
regular grammar, a context-free grammar, statistical word chain probabilities, and, when semantic
analysis is also included, a linguistic theory that covers semantics, such as HPSG.

[0054] The analysis result outputted from the text analysis section 71 is outputted to the keyword
extraction section 74. Referring to the data stored in the keyword database 75, the keyword extraction
section 74 extracts from the inputted analysis result the intention that the user uttered, and outputs
the extraction result to the operation table reference section 76. The keyword database 75 holds, as
keywords used for keyword spotting, linguistic data that indicate user intentions such as exclamations
and commands. Specifically, it holds as keywords the expressions that serve as indices of voice
information in the downstream operation table reference section 76, together with the words that
correspond to them.

[0055] The operation table reference section 76 determines the operation to perform from the
extraction result outputted from the keyword extraction section 74 and the recognition result
outputted from the image recognition section 31B, by referring to the tables stored in the operation
table storage section 77 and the operation classification table storage section 78, respectively.
Here, the table stored in the operation table storage section 77 is explained. Drawing 8 shows an
example of the operation table stored in the operation table storage section 77.

[0056] The table is divided according to the image recognition result: "beckoning", "pointing a
finger", "handshake", "waving a hand", and the case where there is no image recognition result. Each
of these classes is further divided according to whether or not the measured distance to the user is
needed as incidental information. The operation is then determined by the voice recognition result.
[0057] For example, when the image recognition result is that the user is "beckoning", information on
where the user is and how far away, that is, the measurement result, is needed. When the user is
beckoning and the utterance at that time is, for example, "come here", the operation "approach the
user" is determined, whereas if the utterance is, for example, "go away", the operation "move away
from the user" is determined. As described in detail later, even when the user says "come here", the
operation of approaching the user is not necessarily always decided.

[0058] Thus, the operation table describes how a single operation is determined from three pieces of
information: the user's gesture (image recognition result), the user's utterance (voice recognition
result), and, depending on the situation, the distance to the user (measurement result).
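
A table of this kind can be sketched as a dictionary keyed by gesture and utterance. The sketch is illustrative only: the data layout and names are hypothetical, only the "beckoning" rows discussed in the text are filled in, and the second utterance is garbled in the source, so "go away" is an assumed reading.

```python
from typing import Dict, Optional, Tuple

# (gesture, utterance keyword) -> (operation, whether the distance is needed).
# Only the "beckoning" example from the text is shown; "go away" is an
# assumed reading of an utterance that is garbled in the source.
OPERATION_TABLE: Dict[Tuple[str, str], Tuple[str, bool]] = {
    ("beckoning", "come here"): ("approach the user", True),
    ("beckoning", "go away"): ("move away from the user", True),
}


def decide_operation(gesture: str, keyword: str,
                     distance_m: Optional[float] = None) -> Optional[str]:
    """Look up one operation from gesture, utterance, and optional distance."""
    entry = OPERATION_TABLE.get((gesture, keyword))
    if entry is None:
        return None                  # combination not described in the table
    operation, needs_distance = entry
    if needs_distance and distance_m is None:
        return None                  # the required measurement is missing
    return operation
```

For instance, `decide_operation("beckoning", "come here", 1.5)` yields the approach operation, while the same call without a distance yields no decision in this sketch.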

[0059] Drawing 9 shows an example of the operation classification table stored in the operation
classification table storage section 78. The operation classification table classifies the operations
in the operation table. As shown in the operation classification table of drawing 9, the operations of
the operation table can be classified into four kinds: "robot location relative operation", "user
location relative operation", "absolute position operation", and "others".

[0060] "Robot location relative operation" is an operation whose direction and distance are determined
relative to the robot's current position. For example, when the user says "go right" while the robot 1
and the user are facing each other, the user's right is the robot 1's left, so the robot 1
consequently moves to the left.

[0061] "User location relative operation" is an operation whose direction and distance are determined
relative to the user's current position. For example, when the user says "come here", the robot 1
judges how far it must move in order to reach, say, a point 80 cm in front of the user, and moves
according to that judgment.

[0062] An "absolute position operation" is an operation that does not need the current position of the
robot 1 or of the user. For example, when the user's utterance is "go east", east is a direction that
is uniquely determined regardless of the robot's own position and the user's position, so the robot 1
moves in that direction.

[0063] "Others" are operations that need no information on direction or distance, for example the
robot 1 uttering a voice.
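
The four classes and the left/right reversal from the "go right" example can be summarized in a small sketch. The names are hypothetical, and the mapping from each class to the positions it needs is this example's reading of the four descriptions above, not a table taken from the source.

```python
from enum import Enum


class OperationClass(Enum):
    ROBOT_RELATIVE = "robot location relative"
    USER_RELATIVE = "user location relative"
    ABSOLUTE = "absolute position"
    OTHER = "others"


# Which current positions each class needs (an interpretation of the text).
REQUIRED_POSITIONS = {
    OperationClass.ROBOT_RELATIVE: {"robot"},
    OperationClass.USER_RELATIVE: {"user"},
    OperationClass.ABSOLUTE: set(),
    OperationClass.OTHER: set(),
}


def mirror_for_facing_robot(user_direction: str) -> str:
    """Convert the user's left/right into the robot's when they face each other."""
    return {"left": "right", "right": "left"}.get(user_direction, user_direction)
```

With the robot and user facing each other, `mirror_for_facing_robot("right")` gives the robot's "left", as in the example of the robot location relative operation.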

[0064] Next, how the robot 1 decides its operation is explained. As mentioned above, the operation of
the robot 1 is determined by the user's voice and motion (gesture). First, the processing that
recognizes the user's voice is explained with reference to the flow chart of drawing 10. In step S1,
the speech recognition section 31A performs speech recognition processing on the user's voice picked
up by the microphone 15.

[0065] In step S2, the recognition result outputted from the speech recognition section 31A is
inputted to the text analysis section 71 of the action decision mechanism section 33, and text
analysis is performed. In step S3, the keyword extraction section 74 performs keyword matching using
the result of the analysis. In step S4, it is judged whether or not a keyword was extracted. When it
is judged in step S4 that a keyword was extracted, processing proceeds to step S5.

[0066] In step S5, the extracted keyword is taken as the language information. On the other hand, when
it is judged in step S4 that no keyword was extracted, processing proceeds to step S6, and the
information that there is no keyword is taken as the language information. When the processing of step
S5 or step S6 ends, the language information is outputted to the operation table reference section 76
in step S7. This processing is performed repeatedly while the robot 1 is operating.
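
Steps S3 to S7 can be sketched as a simple keyword-spotting function over the recognized text. The keyword list, the dictionary format of the result, and the substring matching are all assumptions made for illustration; the actual keyword database contents and matching method are not given in the text.

```python
from typing import Dict, Optional

# Hypothetical keyword database contents, borrowing example commands
# that appear elsewhere in the text.
KEYWORDS = ("come here", "go away", "walk", "lie down")


def extract_language_info(recognized_text: str) -> Dict[str, Optional[str]]:
    """Return the language information for one speech recognition result."""
    text = recognized_text.lower()
    for keyword in KEYWORDS:           # S3: keyword matching
        if keyword in text:            # S4: was a keyword extracted?
            return {"keyword": keyword}  # S5: keyword becomes the language info
    return {"keyword": None}           # S6: "no keyword" becomes the language info
```

The returned value stands in for the language information handed to the operation table reference stage in step S7.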

[0067] While such speech recognition processing is performed, processing of the user's image is also
performed. The image processing performed in the robot 1 is explained with reference to the flow chart
of drawing 11. In step S11, the feature extraction section 62 of the image recognition section 31B
extracts features from the image captured by the CCD 16. In step S12, that result is used to judge
whether there is a registered gesture. That is, the matching section 65 uses the features outputted by
the feature extraction section 62 to judge whether they match any of the gesture pattern information
stored in the standard-pattern database 66. When it is judged that there is a gesture, processing
proceeds to step S13.

[0068] In step S13, it is judged whether the recognized gesture is one with incidental information. A
gesture with incidental information is, for example, one in which the user points in a given direction
with a finger; in that case, information on the object located in the direction in which the finger
points becomes the incidental information. When it is judged in step S13 that the gesture has
incidental information, the incidental information is detected in step S14. After the detection of the
incidental information ends in step S14, processing proceeds to step S15.
[0069] On the other hand, when it is judged in step S12 that no registered gesture is present, or when it is
judged in step S13 that there is no incidental information, processing also proceeds to step S15. In step
S15, the performance information is output to the operation table reference section 76.
[0070] When processing proceeds from step S12 to step S15, the performance information is the information
that there is no gesture, in other words the information that the image recognition result contains nothing
that decides the action. When processing proceeds from step S13 to step S15, the performance information is
only the information about the gesture. When processing proceeds from step S14 to step S15, the performance
information is the information about the gesture together with the incidental information.
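
The three outcomes of steps S12 to S15 can be sketched as follows. This is only an illustrative sketch: the names PerformanceInfo, GESTURES_WITH_INCIDENTAL and recognize_image are hypothetical and do not appear in the specification, and the matching and object detection steps are assumed to have been done elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PerformanceInfo:
    gesture: Optional[str]     # None means "no registered gesture" (S12 -> S15)
    incidental: Optional[str]  # None means no incidental information (S13 -> S15)


# Hypothetical set of gestures that carry incidental information.
GESTURES_WITH_INCIDENTAL = {"pointing"}


def recognize_image(matched_gesture, pointed_object=None):
    """Mirror steps S12 to S15 for one captured image.

    matched_gesture: gesture name produced by the matching step, or None.
    pointed_object:  object found in the pointed direction, if any.
    """
    if matched_gesture is None:                          # S12: nothing registered
        return PerformanceInfo(None, None)
    if matched_gesture not in GESTURES_WITH_INCIDENTAL:  # S13: gesture only
        return PerformanceInfo(matched_gesture, None)
    return PerformanceInfo(matched_gesture, pointed_object)  # S14: attach it
```

For instance, recognize_image("pointing", "newspaper") carries both the gesture and the incidental information, while recognize_image("beckoning") carries the gesture alone.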

[0071] This image recognition processing is performed repeatedly while the robot 1 is operating. In addition,
the measurement results produced by the face detection section 63 and the distance measurement section 64
are also included, as needed, in the incidental information of step S13.

[0072] Thus, the operation table reference section 76 of the action decision mechanism section 33 decides
the action of the robot 1 using the language information, which is the speech recognition result, and the
performance information, which is the image recognition result. The operation of the operation table
reference section 76 is explained with reference to the flow chart of drawing 12. In step S21, the language
information is input from the keyword extraction section 74 and the performance information is input from
the image recognition section 31B. In step S22, the action is uniquely determined from the input language
information and performance information by referring to the operation table stored in the operation table
storage section 77 and the operation classification table stored in the operation classification table
storage section 78.

[0073] Here, the action that is decided is explained. The action is decided based on the operation table
shown in drawing 8. When the image recognition result (performance information) is "beckoning" and the
speech recognition result (language information) is "come here", three kinds of action are set up, for
example: approaching the user, moving away from the user, and ignoring the user. Ordinarily, when the robot
is beckoned and told "come here", the action of approaching the user should be chosen, but if the robot
always performed the same action the user might get bored.

[0074] Therefore, even when the user makes the same gesture and the same utterance, the robot is made to
perform a different action each time. Which of the three kinds of action is performed may be determined in
order, determined at random, determined by a probability value, determined by a keyword, or determined by
the feeling at that time.

[0075] When a probability value is used, the ratio with which each action is chosen is determined
beforehand, for example 50% for "approach", 30% for "move away", and 20% for "ignore."
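
A minimal sketch of such a probability-based choice is shown below, using the 50/30/20 split quoted above. The function names, the short action labels, and the use of an integer roll from 0 to 99 are assumptions made for illustration and are not taken from the specification.

```python
import random

# Percentages quoted in paragraph [0075]; the labels are shorthand.
ACTION_WEIGHTS = [("approach", 50), ("leave", 30), ("ignore", 20)]


def choose_action(roll):
    """Map an integer roll in [0, 100) to an action by cumulative weight."""
    if not 0 <= roll < 100:
        raise ValueError("roll must be in [0, 100)")
    upper = 0
    for action, weight in ACTION_WEIGHTS:
        upper += weight
        if roll < upper:
            return action
    raise AssertionError("weights must sum to 100")


def choose_action_randomly(rng=random):
    """Draw a roll from the given random source and pick the action."""
    return choose_action(rng.randrange(100))
```

Separating the roll from the mapping keeps the mapping deterministic, so the boundaries (49 versus 50, 79 versus 80) can be checked without depending on a random seed.
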
[0076] When a keyword is used, the decision can be made from the combination of the current action and
utterance and the previous action and utterance. For example, the table is set up so that the action of
approaching the user is always chosen when the user clapped hands as the previous action, beckons as the
current action, and says "come here", and so that the action of moving away from the user is chosen when
the user knocked as the previous action, beckons as the current action, and says "come here".
[0077] Thus, the action may be determined by the combination of the previous action and utterance and the
current action and utterance.
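
The two combinations given in paragraph [0076] can be sketched as a lookup keyed on the previous gesture, the current gesture and the current utterance. The table name, the gesture labels and the default action are hypothetical; the specification also allows the previous utterance to be part of the key, which this sketch omits for brevity.

```python
# Keys are (previous gesture, current gesture, current utterance).
HISTORY_TABLE = {
    ("clap", "beckon", "come here"): "approach",
    ("knock", "beckon", "come here"): "leave",
}


def decide_by_history(previous_gesture, current_gesture, utterance,
                      default="approach"):
    """Look up the action; the default for unlisted keys is an assumption."""
    key = (previous_gesture, current_gesture, utterance)
    return HISTORY_TABLE.get(key, default)
```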

[0078] When the feeling at that time is used, the information of the feeling/instinct model section 32 is
referred to. For example, it is also possible to have the robot approach the user if it is beckoned and
told "come here" while it is feeling anger, and to have it ignore the user if it is beckoned and told
"come here" while it is feeling fear.

[0079] Thus, the operation table reference section 76 decides the action by referring to the operation table
based on the language information and the performance information. The decided action is output to the
posture transition device section 34 in step S23 (drawing 12), and the subsequent sections perform
predetermined processing so that the robot 1 performs the decided action.
[0080] In the embodiment described above, the direction indicated by the user is detected from the direction
in which the user's finger points, and the object that exists in that direction is detected as incidental
information. However, the direction may instead be detected from the direction of the user's face, the
direction of the user's gaze, or the direction in which the user's jaw points, and the incidental
information may be detected from it.
[0081] Moreover, besides the gestures of the embodiment described above, commonly used gestures can also be
used by storing their information in the standard-pattern database 66: for example, the O.K. sign, the
cross (batsu) mark, the circle mark, "safe", covering the ears (cannot hear), ******** (the palm is waved
horizontally), money, a wish or prayer made with the hands, a sign showing intimidation, and the peace sign.

[0082] When the user's speech is recognized, the utterance itself may be ambiguous (not spoken clearly), and
speech recognition may misrecognize it. For example, although the user says "take the apple", the utterance
may not be clear, and as a result of misrecognition by the speech recognition section 31A it may be
recognized as "take the parakeet". How image data is used in such a case to identify whether the word is
"apple" or "parakeet" is explained with reference to the flow chart of drawing 13.

[0083] In step S31, when the user speaks, the voice is captured by the microphone 15 of the robot 1 and
input to the speech recognition section 31A. In step S32, the speech recognition section 31A recognizes the
input sound signal. As a result, it produces two or more candidates for what the user probably said. The
processing of step S33 is performed on the most probable first-place candidate and the second-place
candidate among them.

[0084] In step S33, it is judged whether the difference between the score of the first-place candidate and
the score of the second-place candidate is less than a predetermined threshold. When it is judged that the
difference is not less than the threshold, in other words when the two scores are far enough apart that
the first-place candidate can be accepted as the recognition result without problems, processing proceeds
to step S37, and that recognition result is decided as the speech recognition result and used.

[0085] On the other hand, when it is judged in step S33 that the difference between the score of the
first-place candidate and the score of the second-place candidate is less than the threshold, in other
words when the first-place candidate may be a misrecognition, processing proceeds to step S34 and the two
or more candidates with high scores are taken as the recognition results to be processed. Image recognition
is performed in step S35. The image processed by the image recognition in step S35 is the image captured
when the user made the utterance being processed by speech recognition, or an image captured just before
or after the utterance.
[0086] In step S36, the speech recognition result is complemented using the result of the image recognition
in step S35.

[0087] For example, as mentioned above, suppose that when the user says "take the apple", the first-place
candidate of the recognition result is "take the apple" and the second-place candidate is "take the
parakeet". Furthermore, when the difference between the scores of the first-place and second-place
candidates is less than the predetermined threshold, it cannot be judged which candidate is right. The
captured image is therefore recognized. When it is judged that an apple appears in the image, the
first-place candidate "take the apple" is judged to be the correct recognition result, and when it is
judged that a parakeet appears in the image, the second-place candidate "take the parakeet" is judged to
be the correct recognition result.

[0088] In this way the speech recognition result is complemented, and the complemented result is decided as
the speech recognition result in step S37. Thus, when a recognition result contains ambiguity, using image
information makes it possible to perform speech recognition more reliably.
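
The flow of steps S33 to S37 can be sketched as follows, under stated assumptions. The candidate tuples, the integer scores, the set of visible object labels and the fallback when neither object is seen are all hypothetical; the specification does not say what happens in that last case.

```python
def complement_speech(candidates, visible_objects, threshold):
    """Pick a sentence from speech candidates, using the image when scores are close.

    candidates:      list of (sentence, score, object_word), best score first.
    visible_objects: set of object labels recognized in the captured image.
    threshold:       minimum score gap for trusting the first candidate alone.
    """
    first = candidates[0]
    if len(candidates) < 2:
        return first[0]
    second = candidates[1]
    if first[1] - second[1] >= threshold:          # S33: gap is large enough
        return first[0]                            # S37: accept first place
    for sentence, _score, obj in (first, second):  # S34 to S36: consult the image
        if obj in visible_objects:
            return sentence
    return first[0]  # assumption: fall back to first place if neither is visible
```

With candidates "take the apple" (score 52) and "take the parakeet" (score 50) and a threshold of 10, a parakeet in the image selects the second candidate, whereas a threshold of 1 would accept the first candidate without consulting the image.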

[0089] In the explanation above, only the difference between the scores of the first-place and second-place
candidates was compared, but other approaches may be used, such as taking the differences among the
candidates from the first place to the tenth place.

[0090] Incidentally, suppose that when User A and User B are talking, User A makes an utterance that refers
to an object as "this", and User B replies with an utterance that refers to the same object as "it". Such
a conversation is exchanged frequently in everyday life. That is, a demonstrative pronoun changes with the
situation at that time: the same object is "this" for User A and "it" for User B.

[0091] The same can be said when the robot 1 is talking with the user; the robot 1 therefore needs to
recognize clearly what the user is pointing to. The processing of the robot 1 when recognizing the object
indicated by a demonstrative pronoun is explained with reference to the flow chart of drawing 14. In step
S41, the user speaks, and in step S42 speech recognition is performed on the utterance.

[0092] In step S43, the result of the speech recognition is used to judge whether the user's utterance
contains a demonstrative pronoun. If it is judged that there is no demonstrative pronoun, the result of the
speech recognition is decided as the speech recognition result in step S46.

[0093] On the other hand, when it is judged in step S43 that the user's utterance contains a demonstrative
pronoun, processing proceeds to step S44 and image recognition is performed. The image subjected to image
recognition is the image captured when the user spoke, or an image of what exists in the direction
indicated by the user's finger or the like, captured after that direction has been judged.
[0094] When the image recognition of the captured image has been performed in step S44, the demonstrative
pronoun is complemented in step S45 using the recognition result (image information). A concrete example is
given here. Suppose that the user says something such as "take it" to the robot 1. In doing so, the user
makes a gesture such as pointing with a finger to the object corresponding to "it".
[0095] The robot 1 receives the utterance, performs speech recognition in step S42, and as a result judges
that the utterance includes the demonstrative pronoun "it". It also judges, from the image captured when
the user spoke, that the user is making the gesture of pointing in a particular direction with a finger.
[0096] In step S44, the robot 1 judges the direction to which the user pointed for "it", captures an image
of that direction, and performs image recognition on the captured image. If the image recognition result
is, for example, that the object is a newspaper, the object indicated by the demonstrative pronoun "it" is
complemented as "newspaper." When the demonstrative pronoun has been complemented from the image
information in step S45 in this way, processing proceeds to step S46 and the complemented speech
recognition result is decided as the speech recognition result.

[0097] Thus, using image information makes it possible to recognize reliably the object indicated by a
demonstrative pronoun.
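
Steps S43 to S46 can be sketched as a substitution over the recognized words. The English pronoun list, the word-list representation and the behaviour when nothing was recognized in the pointed direction are assumptions made for illustration only.

```python
DEMONSTRATIVES = ("this", "that", "it")  # illustrative English stand-ins


def complement_pronoun(words, pointed_object):
    """Replace demonstrative pronouns with the object seen in the pointed direction.

    words:          recognized utterance as a list of words.
    pointed_object: label recognized in the pointed direction, or None.
    """
    if not any(w in DEMONSTRATIVES for w in words):  # S43: no pronoun present
        return words                                 # S46: result used unchanged
    if pointed_object is None:
        return words  # assumption: nothing recognized, so nothing to substitute
    # S45: complement each demonstrative pronoun from the image information.
    return [pointed_object if w in DEMONSTRATIVES else w for w in words]
```

For example, the words "take", "it" together with the recognized object "newspaper" become "take", "newspaper".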

[0098] When the robot 1 captures an image, the captured image often contains two or more objects. The
processing that recognizes which of those objects the user is pointing to and talking about is explained
with reference to the flow chart of drawing 15. In step S51, the user's gesture captured by the CCD 16 is
input to the robot 1 as an image.
[0099] When the input gesture is, for example, one that points in a particular direction, the image in the
direction to which the user points must be recognized in order to detect the incidental information. An
image of that direction is therefore captured, and image recognition processing by the image recognition
section 31B is performed on it in step S52. In step S53, the recognition result is used to judge whether
two or more objects exist in the image. If it is judged in step S53 that two or more objects do not exist,
in other words that only one object exists, processing proceeds to step S56 and the image recognition
result for that object is output as the image recognition result.

[0100] On the other hand, when it is judged in step S53 that there are two or more objects, processing
proceeds to step S54 and speech recognition is performed. The voice subjected to speech recognition is the
voice captured when the user made the gesture. In step S55, the image recognition result is complemented
using the result (speech information) of the speech recognition in step S54. An example is given here.

[0101] Suppose that the user says "take the ball" while making the gesture of pointing in a particular
direction. First, the robot 1 recognizes the user's gesture and recognizes that it is a gesture indicating
a particular direction. It then captures an image of the direction pointed to and recognizes the objects
in the image. If it is judged as a result that two or more objects exist, the voice that the user uttered
at the same time as the gesture is recognized.

[0102] If the speech recognition result is "take the ball", the "ball" among the two or more objects in the
image is judged to be the object the user is requesting. That is, the image recognition result is
complemented from the speech information. When the image recognition result has been complemented from the
speech information in this way, processing proceeds to step S56 and the complemented image recognition
result is output as the image recognition result.

[0103] Thus, complementing the ambiguous part of the image information with speech information makes it
possible to acquire image information with higher precision.
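
Steps S53 to S56 can be sketched as below. The function name and the list-based inputs are hypothetical, and returning None when the speech does not single out exactly one object is an assumption, since the specification does not describe that case.

```python
def complement_image(objects_in_image, spoken_words):
    """Return the object the user is asking about, or None if undecided.

    objects_in_image: labels of the objects recognized in the pointed direction.
    spoken_words:     words recognized from the utterance made with the gesture.
    """
    if len(objects_in_image) == 1:  # S53: only one object exists
        return objects_in_image[0]  # S56: output it as the result
    # S54, S55: use the speech information to narrow down the candidates.
    matches = [obj for obj in objects_in_image if obj in spoken_words]
    return matches[0] if len(matches) == 1 else None
```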

[0104] Incidentally, a robot that acts only on speech information will, for example, advance in the
direction from which the user's voice comes, and a robot that acts only on image information will perform
an action such as ** in the direction in which the user appears in the captured image. However, as
mentioned above, the robot 1 to which this invention is applied judges the action the user is requesting
by combining speech information and image information, and then actually takes the action. The actions of
the robot 1 can be classified as already explained and as shown in the operation classification table of
drawing 9.

[0105] That is, the action is decided by recognizing voice and by grasping the positions of the user and of
the robot 1 itself from image information. Specifically, when the user says "come here", speech recognition
of the utterance is carried out first, and then the user's position is recognized from image information.
When the action of moving toward the user is decided, the target position, that is, in which direction and
how far to move, is determined.

[0106] For example, as shown in drawing 16, the target position is set to a place 80 cm in front of the
user. The face detection section 63 recognizes the user's face using the feature quantities extracted by
the feature-extraction section 62 (drawing 6) of the image recognition section 31B, and the distance
measurement section 64 measures the distance between the user and the robot itself based on the size of
the recognized face. Using the measured distance, it is determined how far the robot should advance in
order to move to the position 80 cm in front of the user.
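
The specification does not say how the distance is derived from the size of the face. One possible sketch, assuming a simple pinhole-camera model, is shown below; the focal length, the assumed real face width of 16 cm, the function names, and the choice not to back up when the user is already closer than 80 cm are all assumptions and not part of the source.

```python
TARGET_GAP_CM = 80.0  # stopping distance from paragraph [0106]


def distance_from_face_cm(face_width_px, focal_length_px, real_face_width_cm=16.0):
    """Pinhole-camera estimate: distance = focal length * real width / image width."""
    if face_width_px <= 0:
        raise ValueError("face width must be positive")
    return focal_length_px * real_face_width_cm / face_width_px


def travel_distance_cm(user_distance_cm):
    """How far to advance in order to stop TARGET_GAP_CM in front of the user."""
    return max(0.0, user_distance_cm - TARGET_GAP_CM)
```

With a focal length of 500 pixels, a face 40 pixels wide gives an estimated distance of 200 cm, so the robot would advance 120 cm.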

[0107] Thus, measuring the user's position and using it in the action makes it possible to respond more
precisely to the user's gesture.

[0108] In the embodiment described above, the words actually spoken by the user were recognized as voice.
However, the action of the robot 1 may also be decided using, as voice, a rhythm made by the user's
handclaps or a sound such as the user's footsteps.

[0109] When the user's rhythm or sound is used (hereafter these are also referred to as voice where
appropriate), the speech recognition section 31A has the configuration shown in drawing 17. That is, the
voice captured by the microphone 15 is input to the A/D conversion section 51, converted into digital
data, and then input to the rhythm/sound recognition section 81. The rhythm/sound recognition section 81
acquires information about the rhythm or sound.

[0110] The recognition result acquired by the rhythm/sound recognition section 81 is output to the action
decision mechanism section 33. Note that drawing 17 omits the part that recognizes the user's utterance,
that is, the part shown in drawing 5. Accordingly, the digital sound signal output from the A/D conversion
section 51 is output to the feature-extraction section 52 (drawing 5) and also to the rhythm/sound
recognition section 81 (drawing 17).

[0111] Furthermore, the recognition result output from the rhythm/sound recognition section 81 is output to
the action decision mechanism section 33, where it is input directly to the operation table reference
section 76 rather than to the text analysis section 71 (drawing 7).

[0112] Here, the rhythm recognition method performed by the rhythm/sound recognition section 81 is
explained. The rhythm/sound recognition section 81 detects the rhythm using methods such as beat detection
from percussion-instrument sounds (including the sound of the user's handclaps) or beat detection from
chord changes. As a result, it outputs a detection result indicating when beats were detected, what the
meter is, and how many beats there are per minute.

[0113] As methods of rhythm detection, it is possible to use the methods disclosed in, for example, "A
sound-source separation system for percussion instrument sounds", Goto and Yoichi Muraoka, Transactions of
the Institute of Electronics, Information and Communication Engineers, J77-DII, No. 5, pp. 901 to 911,
1994, and "A real-time beat tracking system for acoustic signals", Goto and Yoichi Muraoka, Transactions of
the Institute of Electronics, Information and Communication Engineers, J81-DII, No. 2, pp. 227 to 237,
1998.

[0114] Here, the case where the action decided by the action decision mechanism section 33 (operation table
reference section 76) using the rhythm recognition result output from the rhythm/sound recognition section
81 is a dance is explained as an example. An operation table such as the one shown in drawing 18 is stored
in the operation table storage section 77. For example, when the rhythm recognition result is 0 to 60 beats
per minute in duple time, dance A is chosen, and when it is 0 to 60 beats per minute and is neither duple,
triple, nor quadruple time, dance A is also chosen. In this way, the type of dance is uniquely chosen from
the information on the meter and the number of beats per minute.
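
The two table rows quoted above can be sketched as a small function. Only those two rows are encoded; the remaining rows of drawing 18 are not given in the text, so the sketch returns None for them rather than guessing. The function name and the encoding of the meter as 2, 3, 4 or None are assumptions.

```python
def choose_dance(beats_per_minute, meter):
    """Return the dance type for the rows quoted in paragraph [0114].

    meter: 2, 3 or 4 for duple, triple or quadruple time; any other value
           (for example None) means none of these meters was detected.
    """
    slow = 0 <= beats_per_minute <= 60
    if slow and meter == 2:               # 0 to 60 beats per minute, duple time
        return "A"
    if slow and meter not in (2, 3, 4):   # 0 to 60 beats, none of the three meters
        return "A"
    return None                           # rows not quoted in the text
```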

[0115] Thus, the operation table reference section 76 decides the action by referring to the operation
table stored in the operation table storage section 77, and the sections after the action decision
mechanism section 33 perform predetermined processing so that the robot 1 is controlled to perform the
decided action.

[0116] In the explanation above, the information about the rhythm was acquired from voice, but it may also
be acquired from gesture. When the information about the rhythm is acquired from gesture, the image
recognition section 31B may have the configuration shown in drawing 6. As a method of acquiring the
information about the rhythm from gesture, it is possible to use the method disclosed in "Recognition of
sensibility expression by gesture", Seiji Iguchi, Robotics Society of Japan, Vol. 17, No. 7.

[0117] Of course, the information about the rhythm may be acquired from both voice and gesture.
[0118] Next, the case where the action of the robot 1 is decided by sound is explained. The sound
recognized by the rhythm/sound recognition section 81 indicates what kind of sound it was, such as a
footstep or a scream, and who or what emitted it, for example a sound emitted by a liked person, a sound
emitted by a disliked person, or a sound emitted by a vehicle.

[0119] The result recognized by the rhythm/sound recognition section 81 is output to the operation table
reference section 76. The operation table reference section 76 decides the action corresponding to the
input sound recognition result by referring to the operation table stored in the operation table storage
section 77. An example of the operation table for sound stored in the operation table storage section 77
is shown in drawing 19.

[0120] The operation table shown in drawing 19 is a table in which the action is uniquely determined
according to the sound event. For example, when a footstep is detected as the sound recognition result and
the footstep is judged to be that of a liked person, the action of approaching joyfully is chosen. The
robot 1 may judge who is a liked person and who is a disliked person from conversations exchanged between
the user and the robot 1, the user's attitude, and so on, and store this as information.

[0121] Moreover, not only sound but also image information may be used. That is, when a footstep is heard,
it is possible to judge from the footstep who has come, but the person may also be captured as an image,
who the person is and whether the person is liked or disliked may be judged from the image, and the action
may be decided from the recognized result.
[0122] As described above, combining speech information and image information makes it possible to have the
robot 1 perform a variety of actions. Furthermore, using each kind of information to complement the other
at the stage of recognizing the voice and the image for the action decision makes it possible to perform
recognition processing with higher precision.

[0123] The series of processes described above can be performed by hardware, and can also be performed by
software. When the series of processes is performed by software, the program constituting the software is
installed from a recording medium into a computer built into dedicated hardware, or into, for example, a
general-purpose personal computer that can perform various functions by installing various programs.

[0124] As shown in drawing 20, this recording medium is constituted not only by package media that are
distributed separately from the computer in order to provide the program to the user and on which the
program is recorded, such as the magnetic disk 131 (including a floppy disk), the optical disk 132
(including CD-ROM (Compact Disk-Read Only Memory) and DVD (Digital Versatile Disk)), the magneto-optical
disk 133 (including MD (Mini-Disk)), or the semiconductor memory 134, but also by the ROM 112 in which the
program is stored and the hard disk included in the storage section 118, which are provided to the user
already built into the computer.

[0125] In this specification, the steps describing the program provided by the medium include not only
processing performed in time series according to the described order, but also processing that is not
necessarily performed in time series and is instead performed in parallel or individually.

[0126] Moreover, in this specification, a system means the whole apparatus constituted by two or more
devices.

[0127] 

[Effect of the Invention] As described above, according to the information processor according to claim 1,
the information processing method according to claim 10, and the recording medium according to claim 11,
voice is recognized, an image is recognized, and the action of a robot is decided using at least one of
the speech recognition result and the image recognition result, so that speech recognition and image
recognition can be performed with higher precision.



DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] It is the perspective view showing an example of the external configuration of one embodiment
of a robot to which this invention is applied.
[Drawing 2] It is the block diagram showing an example of the internal configuration of the robot of
drawing 1.
[Drawing 3] It is the block diagram showing an example of the functional configuration of the controller 10
of drawing 2.

[Drawing 4] It is a drawing showing an example of the functional configuration of the part that recognizes
voice and images and decides the action.
[Drawing 5] It is the block diagram showing the internal configuration of the speech recognition section
31A.
[Drawing 6] It is the block diagram showing the internal configuration of the image recognition section
31B.
[Drawing 7] It is the block diagram showing the internal configuration of the action decision mechanism
section 33.
[Drawing 8] It is a drawing explaining the operation table stored in the operation table storage section
77.
[Drawing 9] It is a drawing explaining the operation classification table stored in the operation
classification table storage section 78.
[Drawing 10] It is a flow chart explaining speech recognition processing.
[Drawing 11] It is a flow chart explaining image recognition processing.
[Drawing 12] It is a flow chart explaining action decision processing.
[Drawing 13] It is a flow chart explaining the processing for outputting a recognition result using speech
information and image information.
[Drawing 14] It is a flow chart explaining other processing for outputting a recognition result using
speech information and image information.
[Drawing 15] It is a flow chart explaining still other processing for outputting a recognition result
using speech information and image information.
[Drawing 16] It is a drawing explaining the positional relationship between the user and the robot 1.
[Drawing 17] It is a drawing showing another configuration of the speech recognition section 31A.
[Drawing 18] It is a drawing explaining another operation table stored in the operation table storage
section 77.
[Drawing 19] It is a drawing explaining still another operation table stored in the operation table
storage section 77.
[Drawing 20] It is a drawing explaining the medium.
[Description of Notations] 

10 Controller, 10A CPU, 10B Memory, 15 Microphone, 16 CCD, 17 Touch sensor, 18 Loudspeaker,
19 Communications section, 31 Sensor input processing section, 31A Speech recognition section,
31B Image recognition section, 31C Pressure processing section, 32 Feeling/instinct model section,
33 Action decision mechanism section, 34 Posture transition device section, 35 Control mechanism section,
36 Speech synthesis section



[Claim(s)] 

[Claim 1] An information processor built into a robot, characterized by including speech recognition means
for recognizing voice, image recognition means for recognizing an image, and decision means for deciding an
action of said robot using at least one of the recognition result by said speech recognition means and the
recognition result by said image recognition means.

[Claim 2] The information processor according to claim 1, characterized by further including holding means
for holding a table that describes the relation between the recognition result by said speech recognition
means, the recognition result by said image recognition means, and the action of said robot uniquely
determined by those recognition results.

[Claim 3] The information processor according to claim 1, characterized in that, when the recognition
result by said speech recognition means is not uniquely determined, said decision means determines it
uniquely using the recognition result by said image recognition means, and decides the action of said robot
using the uniquely determined recognition result.

[Claim 4] The information processor according to claim 1, characterized in that, when two or more objects
exist in said image recognized by said image recognition means, said decision means decides the action of
said robot using a recognition result uniquely determined by using the recognition result by said speech
recognition means.

[Claim 5] The information processor according to claim 1, characterized in that said image recognition
means detects the direction pointed to by a preset part among a user's finger, face, eyes, and jaw, and
recognizes the image located in that direction.

[Claim 6] The information processor according to claim 1, characterized by further including storage means
for storing data about gestures performed by a user, wherein said image recognition means recognizes an
image of said user, thereby detects a gesture stored in said storage means, and takes the detected result
as the recognition result.

[Claim 7] The information processor according to claim 1, characterized by further including a detection means to
detect a user's face, and a measurement means to measure the distance between said user and the robot itself from
the size of said user's face detected by said detection means, wherein said decision means also uses the
measurement result by said measurement means to determine the operation of said robot.
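
Claim 7 states only that the distance is measured from the size of the detected face. One common way to do this,
shown here purely as an assumed illustration, is the pinhole-camera relation: distance = focal length x real width
/ width in pixels. The focal length, the average face width, the 80 cm threshold, and both function names are
hypothetical values chosen for the sketch and are not taken from the claims.

```python
def estimate_distance_cm(face_width_px: float,
                         focal_length_px: float = 500.0,
                         real_face_width_cm: float = 16.0) -> float:
    """Estimate camera-to-face distance with the pinhole-camera relation.

    All default parameter values are assumptions for illustration.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return focal_length_px * real_face_width_cm / face_width_px


def choose_operation(distance_cm: float,
                     near_threshold_cm: float = 80.0) -> str:
    """Pick an operation from the measured distance (hypothetical rule)."""
    if distance_cm <= near_threshold_cm:
        return "stay in place"
    return "approach the user"
```

A larger face in the image gives a smaller estimated distance, so the decision means could, for instance, stop
approaching once the user is judged to be close enough.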

[Claim 8] The information processor according to claim 1, characterized in that said speech recognition means
detects the rhythm contained in an environmental sound and makes it a recognition result.

[Claim 9] The information processor according to claim 1, characterized in that said speech recognition means
detects an acoustic phenomenon from an environmental sound and makes it a recognition result.

[Claim 10] An information processing method of an information processor built into a robot, characterized by
including a speech recognition step which recognizes voice, an image recognition step which recognizes an image,
and a decision step which determines the operation of said robot using at least one of the recognition result by
processing of said speech recognition step and the recognition result by processing of said image recognition
step.

[Claim 11] A recording medium on which a computer-readable program is recorded, the program being a program for
information processing of an information processor built into a robot and characterized by including a speech
recognition step which recognizes voice, an image recognition step which recognizes an image, and a decision step
which determines the operation of said robot using at least one of the recognition result by processing of said
speech recognition step and the recognition result by processing of said image recognition step.

Japanese Unexamined Patent Publication No. 2001-188555

F-terms (reference):
3F059 AA00 BA02 BB07 BC06 CA05 CA06 DA02 DA03 DA05 DA09 DB02 DB09 DC01 DC04 DC07 DD01 DD06 DD18 FA03 FA05
      FB01 FC07 FC14 FC15
5B057 AA05 BA02 CA08 CA12 CA16 DA17 DB02 DB09 DC08 DC16 DC22 DC36
5D015 KK01 LL07
5L096 AA06 CA14 DA02 FA06 FA14 FA66 FA67 HA03
9A001 HH17 HH19 HH20



